Skip to content

fix(content): replace unbounded query loop with LIMIT-1 DB call in validateRelationships() (#35489)#35491

Merged
dsilvam merged 8 commits into
mainfrom
fix/issue-35489-validate-relationships-limit
May 5, 2026
Merged

fix(content): replace unbounded query loop with LIMIT-1 DB call in validateRelationships() (#35489)#35491
dsilvam merged 8 commits into
mainfrom
fix/issue-35489-validate-relationships-limit

Conversation

@fabrizzio-dotCMS
Copy link
Copy Markdown
Member

Problem

Re-saving a contentlet with required many-to-many relationships took up to 10 minutes on environments with ~500 related records and ~40 configured languages. Profiling showed 99% of CPU time in ESContentletAPIImpl.validateRelationships().

Two hotspots caused an N×M SQL query explosion (500 identifiers × 40 languages = 20 000+ queries per save):

  1. Required-relationship existence check (the !foundInRelationships path) called getRelatedContent(), which routes through RelationshipCache and fires one contentlet_version_info SELECT per identifier per language — just to answer a boolean "does anything exist?".

  2. ONE_TO_MANY cardinality check inside the per-child loop also called getRelatedContent() unconditionally for every cardinality, including MANY_TO_MANY, triggering the same cache-miss explosion even when the result was never needed.

Fix

Both calls are replaced with FactoryLocator.getRelationshipFactory().dbRelatedContent(relationship, contentlet, hasParent, false, null, 1, 0), which issues a single LIMIT 1 query directly against tree + contentlet_version_info and stops at the first matching row. The cardinality check is also guarded behind if (ONE_TO_MANY) so it is skipped entirely for other cardinalities.

This mirrors the approach from PR #35407 (which fixed a similar bottleneck in RelationshipFactoryImpl for an older branch) and adapts it to the refactored validateRelationships() in main.

Changes

  • dotCMS/src/main/java/com/dotcms/content/elasticsearch/business/ESContentletAPIImpl.java
    • Replaced getRelatedContent() in required-relationship existence check with dbRelatedContent(…, 1, 0)
    • Wrapped ONE_TO_MANY cardinality DB check in if (ONE_TO_MANY) guard and replaced getRelatedContent() with dbRelatedContent(…, 1, 0)

Acceptance Criteria

  • Re-saving a contentlet with required relationships completes in under 10 seconds (baseline: ~10 min)
  • Total JDBC executions during save are in the hundreds, not tens of thousands
  • Required-relationship validation still rejects saves where a required relationship has no children in the DB
  • Required-relationship validation still passes when the relationship is omitted from the payload but children exist in the DB
  • ONE_TO_ONE cardinality validation still rejects saves that link more than one contentlet
  • Content types with no required relationships are unaffected

Test

./mvnw verify -pl :dotcms-integration -Dcoreit.test.skip=false \
  -Dit.test=ESContentletAPIImplTest#testValidateRelationships

References

🤖 Generated with Claude Code

…uery in validateRelationships() (#35489)

Two hotspots in ESContentletAPIImpl.validateRelationships() caused N×M query explosion
(500 related identifiers × 40 languages = 20 000+ SQL queries per save):

1. The required-relationship existence check called getRelatedContent(), which routes
   through RelationshipCache and fires one query per identifier per language.
   Replaced with FactoryLocator.getRelationshipFactory().dbRelatedContent(…, 1, 0)
   — a single LIMIT-1 query on tree + contentlet_version_info that stops at first hit.

2. The ONE_TO_MANY cardinality check inside the per-child loop called
   getRelatedContent() unconditionally for every cardinality (including MANY_TO_MANY),
   triggering the same cache-miss explosion. Guarded the DB call behind an
   if (ONE_TO_MANY) check and replaced it with the same dbRelatedContent(…, 1, 0) call.

Refs: #35489

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Apr 28, 2026

Claude finished @fabrizzio-dotCMS's task in 3m 19s —— View job


PR Review

  • Read changed files
  • Analyze implementation correctness
  • Check test coverage
  • Post review feedback

Fix is sound for the two targeted hotspots. A few issues worth a second look:


1. isValidOneToOneRelationship() still uses getRelatedContent() — unfixed path

ESContentletAPIImpl.java:8809 (untouched):

List<Contentlet> relatedContents = getRelatedContent(
    contentsInRelationship.get(0), relationship, null,
    APILocator.getUserAPI().getSystemUser(), true, 1, 0, null);

This PR guards the main loop with if (ONE_TO_MANY) but doesn't touch isValidOneToOneRelationship, which still routes through RelationshipCache. For ONE_TO_ONE it's always at most 1 child, so the blast radius is limited — but if the environment also has ONE_TO_ONE relationships at scale, the cache-miss path is still live. Not a regression, but worth noting as a follow-up.


2. Existence check still hydrates a full Contentlet per row

dbRelatedContentByParent / dbRelatedContentByChild internally call APILocator.getContentletAPI().find(inode, ...) for every row returned (line 625 in RelationshipFactoryImpl.java). For the LIMIT 1 existence check, that's 2 round-trips (1 join query + 1 contentlet fetch) instead of the 20 000+ it was, which is a massive improvement — but a pure COUNT(*) or EXISTS query at the DB level would reduce it to 1 round-trip and avoid object allocation entirely. The current approach is acceptable for now, but it's an obvious next step if this path remains hot.


3. null orderBy passed without documentation

Both new calls pass null for orderBy:

dbRelatedContent(rel, contentlet, checkParent, false, null, 1, 0)

SQLUtil.sanitizeSortBy(null) returns null, UtilMethods.isSet(null) is false, so includeContentlet is false and the query falls back to ORDER BY t.tree_order, cvi.version_ts. This is safe, but it's a subtle dependency on internal fallback behavior. Passing "tree_order" explicitly (matching the existing convention in the codebase at lines 2376, 2471) would make the intent clear and future-proof against sanitizer changes.


4. Language semantics silently changed

Original getRelatedContent() routed through Elasticsearch and would have respected language context. The new dbRelatedContent() has no language filter in its SQL — it returns results across all languages. For the "does any related content exist?" check this is actually more correct (don't fail a required-relationship validation just because the current language differs). But the behavioral change is undocumented, and it's non-obvious that this is intentional rather than accidental.


5. Missing test: rejection path for required relationship with no DB content

The first test (validateRelationships_requiredField_existingDbRelation_doesNotThrow) validates that re-save succeeds when content exists in the DB. There's no test verifying the mirror case: a new contentlet (or one with no DB-side relationship) with a required field and an empty payload is still rejected. That's the path at lines 8645-8650. If someone accidentally inverted the isEmpty() check, the existing tests wouldn't catch it.


6. parent2 = next() in ONE_TO_MANY test is an unnamed stub, not a real saved entity

ESContentletAPIImplTest.java:5102:

final Contentlet parent2 = new ContentletDataGen(parentCT).setPolicy(IndexPolicy.FORCE).next();
// ...
contentletAPI.checkin(parent2, Map.of(relationship, list(child)), user, false);

parent2 has no identifier. The cardinality check at line 8709 evaluates !relatedContents.get(0).getIdentifier().equals(contentlet.getIdentifier()) where contentlet.getIdentifier() is null/empty. This comparison returns false (correctly triggering the violation), but for a different reason than intended — it's parent1Id.equals(null) rather than parent1Id.equals(parent2Id). The test would fail to detect a regression where the check uses inode or some other field instead of identifier. Persisting parent2 first would make this a genuine identity comparison.


Nothing in the core logic looks wrong. The two hotspots are correctly addressed, the cardinality guard for ONE_TO_MANY is sound, and catching only DotDataException (dropping DotSecurityException) is correct since dbRelatedContent doesn't declare it. The items above are improvements, not blockers — except #5 (missing rejection test) which is a real coverage gap.

fabrizzio-dotCMS and others added 6 commits April 28, 2026 19:32
…tent migration

- Remove DotSecurityException from catch — dbRelatedContent() uses systemUser
  internally and never throws it
- Replace size() > 0 with isEmpty() for idiomatic Java

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…T-1 fix (#35489)

Three tests in ESContentletAPIImplTest covering the two hotspots replaced in
ESContentletAPIImpl.validateRelationships():

1. validateRelationships_requiredField_existingDbRelation_doesNotThrow — re-checkin
   without contentRelationships must NOT fail when a required child already exists in DB.

2. validateRelationships_oneToMany_childWithDifferentParent_throwsBadCardinality —
   ONE_TO_MANY cardinality check must reject a second parent claiming an owned child.

3. validateRelationships_manyToMany_childSharedAcrossParents_doesNotThrow — the
   ONE_TO_MANY guard must be skipped for MANY_TO_MANY so no false cardinality error
   is raised when multiple parents share the same child.

Refs: #35489

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dsilvam dsilvam added this pull request to the merge queue May 5, 2026
Merged via the queue into main with commit 467b276 May 5, 2026
78 of 85 checks passed
@dsilvam dsilvam deleted the fix/issue-35489-validate-relationships-limit branch May 5, 2026 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI: Safe To Rollback Area : Backend PR changes Java/Maven backend code

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Slow contentlet save with required relationships — unbounded query loop in validateRelationships()

3 participants